Weighting against homoplasy improves phylogenetic analysis of morphological data sets

Authors

  • Pablo A. Goloboff
  • James M. Carpenter
  • J. Salvador Arias
  • Daniel Rafael Miranda Esquivel
  • Miguel Lillo
Abstract

The problem of character weighting in cladistic analysis is revisited. The finding that, in large molecular data sets, removal of third positions (with more homoplasy) decreases the number of well-supported groups has been interpreted by some authors as indicating that weighting methods are unjustified. Two arguments against that interpretation are advanced. Characters that collectively determine few well-supported groups may be highly reliable when taken individually (as shown by specific examples), so that inferring greater reliability for sets of characters that lead to an increase in jackknife frequencies may not always be warranted. But even if changes in jackknife frequencies can be used to infer reliability, we demonstrate that jackknife frequencies in large molecular data sets are actually improved when downweighting characters according to their homoplasy but using properly rescaled functions (instead of the very strong standard functions, or the extreme of inclusion/exclusion); this further weakens the argument that downweighting homoplastic characters is undesirable. Last, we show that downweighting characters according to their homoplasy (using standard homoplasy-weighting methods) on 70 morphological data sets (with 50–170 taxa) produces clear increases in jackknife frequencies. The results obtained under homoplasy weighting also appear more stable than results under equal weights: adding either taxa or characters, when weighting against homoplasy, produced results more similar to the original analyses (i.e., with larger numbers of groups that continue being supported after addition of taxa or characters), with similar or lower error rates (i.e., proportion of groups recovered that subsequently turn out to be incorrect). Therefore, the same argument that had been advanced against homoplasy weighting in the case of large molecular data sets is an argument in favor of such weighting in the case of morphological data sets. © The Willi Hennig Society 2008.
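As a rough illustration of what downweighting characters according to their homoplasy can look like, the sketch below uses the concave fit function of implied weighting (Goloboff, 1993), f = k / (k + h), where h is the number of extra steps a character requires on a tree and k is the concavity constant. The function names, the default k = 3.0, and the homoplasy values of the two trees are hypothetical and chosen only for illustration. Using a larger k is one simple way to weight less strongly; it is not necessarily the rescaling applied in the paper.

```python
from typing import Sequence


def implied_fit(extra_steps: int, k: float = 3.0) -> float:
    """Fit of one character: 1.0 with no homoplasy, approaching 0 as homoplasy grows."""
    if extra_steps < 0 or k <= 0:
        raise ValueError("extra_steps must be >= 0 and k must be > 0")
    return k / (k + extra_steps)


def total_fit(extra_steps_per_character: Sequence[int], k: float = 3.0) -> float:
    """Sum of character fits for one tree; a higher total is preferred."""
    return sum(implied_fit(h, k) for h in extra_steps_per_character)


if __name__ == "__main__":
    # Hypothetical extra steps for three characters on two candidate trees.
    # Both trees need 6 extra steps in total, so equal weights cannot choose.
    tree_a = [0, 0, 6]  # all homoplasy concentrated in one character
    tree_b = [2, 2, 2]  # the same homoplasy spread over three characters
    for k in (3.0, 30.0):  # 30.0 stands in for a much milder weighting
        print(k, round(total_fit(tree_a, k), 3), round(total_fit(tree_b, k), 3))
```

Under this function, each additional step in an already homoplastic character costs less fit than a first extra step in a character that was previously free of homoplasy, so the tree that confines conflict to one character scores higher. With the larger k the two totals move closer together, which is the sense in which the weighting becomes milder.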
Character weighting in cladistics has traditionally been a controversial issue. Authors in favor of weighting have usually considered that characters with more homoplasy are less reliable. This was the basis of Farris’s (1969) successive weighting method and its noniterative descendants, implied weighting and self-weighted optimization (Goloboff, 1993, 1997; see general discussion in De Laet, 1997). However, Källersjö et al. (1999), in analyzing a large rbcL matrix (with about 2500 taxa, hereinafter rbcL-2500), found that the number of well-supported groups and the average jackknife resampling frequency decreased strongly when the most homoplastic characters (the third positions) were eliminated from the analysis. Källersjö et al. (1999) observed that “contrary to earlier expectations, increasing saturation and frequency of change ... actually improve the ability to recognize well-supported phylogenetic groups.” In addition to concluding that the common practice of eliminating third positions from phylogenetic analysis is probably pernicious, Källersjö et al. (1999, p. 93) also suggested that weighting methods that “rest on the idea that more homoplasy implies less reliability and less structure ... may not be well advised.”

*Corresponding author. E-mail address: [email protected]
© The Willi Hennig Society 2008. Cladistics 24 (2008) 1–16. doi: 10.1111/j.1096-0031.2008.00209.x

As Källersjö et al. (1999) considered that their findings provided possible arguments against downweighting characters on the basis of homoplasy, Farris (2001) proposed an alternative method, support weighting. Support weighting relates reliability to the number of well-supported groups set off by the character (i.e., the number of well-supported groups for which changes in the character appear as synapomorphies), and is explicitly intended to estimate weights regardless of homoplasy. Farris (2001) tested the method on the jackknife tree for rbcL-2500, and it gave third positions (which discriminate more groups) higher weights than first and second positions (which discriminate very few groups).

Although Källersjö et al. (1999) did not consider their results as providing evidence against weighting in general, other authors did. Even authors who had otherwise used only philosophical (or merely rhetorical) arguments to champion the exclusive and mandatory use of equal weights have referred to the empirical findings of Källersjö et al. with great approval (e.g., Grant and Kluge, 2005, p. 602; Kluge, 2005, p. 27). A reanalysis of rbcL-2500, presented below, shows that the groups supported by first and second positions, even if few, are compatible with the groups supported by third positions. In other words, even if first and second positions distinguish few groups, they do so reliably. The extrapolation from jackknife frequencies to the reliability of individual characters is therefore not justified. Furthermore, using implied weighting to analyze large molecular data sets improves jackknife frequencies (and associated measures), as long as the weighting strength is properly rescaled.

In the case of morphological data sets, a trend opposite to that of Källersjö et al. (1999) had been documented before. Goloboff (1997; using 14 morphological data sets, with 14–47 taxa) showed that average jackknife frequencies were increased, relative to equal weights, when using either implied weighting, successive weighting, or self-weighted optimization. Ramírez (2003) also documented a similar trend. The present paper reports the most extensive comparison carried out to date between the results under equal and differential character weighting in morphological data sets. Our results show that, for morphological data, jackknife frequencies and other resampling measures are clearly improved when weighting against homoplasy.
This is the same criterion that critics (e.g., Grant and Kluge, 2005, p. 602; Kluge, 2005, p. 27) had used to argue against weighting in the case of large molecular data sets. Therefore, defending equal weights on the basis of jackknife frequencies in the case of large molecular data sets requires, by the same logic, that weighting against homoplasy be defended in the case of morphological data sets.

Kluge’s criticisms of weighting

Kluge (1997a,b, 2005) has published the most prominent and vocal criticisms of weighting, claiming that he has rejected weighting on the basis of Popper’s ideas on falsification, and the very concepts of the nature of evidence and objectivity in science. Thus (the reader is led to conclude), those in favor of weighting oppose Popper, evidence, objectivity and scientific methods; but we do not. Kluge (1997b) repeats Turner and Zandee’s (1995) characterization of weighting as producing “unparsimonious” trees (simply by defining “most parsimonious” as “having fewest steps under equal weights”), despite the fact that Goloboff (1995) had replied to exactly that same argument. Just like Turner and Zandee before him, Kluge (1997b) appeals to Farris’s (1983) demonstration that parsimony maximizes explanatory power, and claims that Farris (1983) showed that weighted hypotheses provide defective explanation. But if the support for arguments against weighting is supposed to come from Farris (1983), then there is no support at all: Farris (1983, p. 1011) had been (as noted by Goloboff, 1993, 1995) explicit that parsimony is not equivalent to equal weights, and that step counts must be weighted step counts when some characters represent stronger evidence than others. Kluge (2005, p. 27) seems later to have realized this much, because he no longer cites Farris (1983) in support of the idea that weighting leads to unparsimonious hypotheses: “weighting leads to suboptimal, less-parsimonious, not more parsimonious, phylogenetic hypotheses when it comes to the data of observation (Kluge, 1997b; see, however, Farris, 1983).” It is especially ironic that Kluge (2005) cites Kluge (1997b) as providing justification for his statement regarding weighting and unparsimonious hypotheses, because there is no justification in Kluge (1997b) other than an appeal to Farris (1983).

Kluge (1997b) also refers to Popper’s formula of corroboration,

$$C(h,e,b) = \frac{p(e,hb) - p(e,b)}{p(e,hb) + p(e,b) - p(he,b)}$$

(where $C$ = corroboration, $e$ = evidence, $h$ = hypothesis, and $b$ = background knowledge; see Farris, 1995 for discussion of this formula, and note here that Popper never meant his formula to be applied in real cases, as the actual values of the terms cannot be objectively measured; Popper simply used it to illustrate some general relationships between probability and corroboration). According to Kluge (1997b, p. 352): all of the justifications for differential character weighting ... follow a verificationist agenda – the application of weights supposedly improves one’s chances of discovering objective truth ... Weighting under any such guise negatively impacts C

Similar resources

Character analysis in morphological phylogenetics: problems and solutions.

Many aspects of morphological phylogenetics are controversial in the theoretical systematics literature and yet are often poorly explained and justified in empirical studies. In this paper, I argue that most morphological characters describe variation that is fundamentally quantitative, regardless of whether they are coded qualitatively or quantitatively by systematists. Given this view, three ...

Dental Data Perform Relatively Poorly in Reconstructing Mammal Phylogenies: Morphological Partitions Evaluated with Molecular Benchmarks

Phylogenetic trees underpin reconstructions of evolutionary history and tests of evolutionary hypotheses. They are inferred from both molecular and morphological data, yet the relative value of morphology has been questioned in this context due to perceived homoplasy, developmental linkage, and nonindependence of characters. Nevertheless, fossil data are limited to incomplete subsets of preserv...

The morphological state space revisited: what do phylogenetic patterns in homoplasy tell us about the number of possible character states?

Biological variety and major evolutionary transitions suggest that the space of possible morphologies may have varied among lineages and through time. However, most models of phylogenetic character evolution assume that the potential state space is finite. Here, I explore what the morphological state space might be like, by analysing trends in homoplasy (repeated derivation of the same characte...

Quantification of homoplasy for nucleotide transitions and transversions and a reexamination of assumptions in weighted phylogenetic analysis.

Nucleotide transitions are frequently down-weighted relative to transversions in phylogenetic analysis. This is based on the assumption that transitions, by virtue of their greater evolutionary rate, exhibit relatively more homoplasy and are therefore less reliable phylogenetic characters. Relative amounts of homoplastic and consistent transition and transversion changes in mitochondrial protei...

Publication date: 2008